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Abstract. We introduce a novel image decomposition package, Galphat, that 
provides robust estimates of galaxy surface brightness profiles using Bayesian Markov 
Chain Monte Carlo. The GALPHAT-determined posterior distribution of parameters 
enables us to assign rigorous statistical confidence intervals to maximum a posteriori 
estimates and to test complex galaxy formation and evolution hypotheses. We describe 
the Galphat algorithm, assess its performance using test image data, and demonstrate 
that it has sufficient speed for production analysis of a large galaxy sample. Finally 
we briefly introduce our ongoing science program to study the distribution of galaxy 
structural properties in the local universe using Galphat. 



1. What and Why is GALPHAT? 

Large photometric and spectroscopic surveys of galaxies [e.g. SDSS|3) and 2MASS|3)] 
continue to provide vast ensembles of galaxy images, but rigorous conclusions about 
galaxy formation and evolution hypotheses based on the full volume of information 
present serious computational and algorithmic challenges. For example, parametric 
models are widely used to derive galaxy structural parameters: brightness, size, profile 
shape and ellipticity. A recent study (jl|) presented a bulge-disc decomposition for 10 4 
nearby galaxies. However, an accurate decomposition is stymied by degeneracies in the 
parameter estimation itself. A Sersic profile has 8 parameters, and the topology of the 
likelihood function in this high-dimensional parameter space is too complex to visualize 
and hard to characterize robustly. In most previous galaxy decomposition analyses, 
the uncertainties of derived model parameters have not been carefully characterized. 
The correlations of physical properties and structural parameters of galaxies are usually 
assessed through scatter plots of the best-fit parameters (e.g. maximum a posteriori es- 
timates with formal inverse Hessian variance estimates). These correlations are subject 
to strong contamination by underlying systematic correlations of each model parameter. 

A Bayesian approach to parameter inference and hypothesis testing naturally ad- 
dresses these difficulties. The probability of model parameters (M) for a given data set 
(D), P(M\D), is related to the probability of the data given the model (the likelihood), 
P(D\M), and prior probability of the model, P(M), through Bayes theorem: 

P(M\D) = nD\M)nM) = P { D\M)P(M) 
V 1 ; P(D) J P(D\M)P(M)dM y ' 

In our galaxy decomposition problem, M is the vector of parameters describing the 
model and D is the image data. For a high-dimensional model space, direct integra- 
tion is infeasible, and one resorts to Markov Chain Monte Carlo (MCMC) methods. 
Although costly, sampling the posterior using MCMC returns a robust estimate of the 
model parameters. From this distribution, one may also define a best-fit parameter value 
and confidence bounds, but the real power in this approach is the natural description 
by the posterior distribution of any intrinsic correlation between model parameters. 
This power comes at a cost: Bayesian MCMC requires much more computation time 
than x 2 minimization. To make this feasible for present-day surveys with large num- 
bers of galaxies, we present a novel image decomposition package Galphat (GALaxy 
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PHotometric ATtributesQ) which uses a state-of-the-art Bayesian computation software 
package BIE (Bayesian Inference Engin^3) for sampling MCMC efficiently and incor- 
porates an optimized image processing algorithm to reduce the likelihood evaluation 
time. We will introduce the algorithm, test its performance, and present science using 
Galphat in the following sections. 

2. The GALPHAT Algorithm 

Given a parameter vector M, Galphat produces a likelihood function for an image 
(pixel data, mask, PSF, background) and the BIE samples the posterior for a prior dis- 
tribution using an MCMC algorithm. The BIE provides a choice of MCMC algorithm 
depending on the complexity of the problem. To date, we have found that a multiple 
chain algorithm with tempering (5j) is a good compromise between speed and flexibil- 
ity. For the tests described here, Galphat models the galaxy surface brightness using 
multiple 2D Sersic profiles with arbitrary ellipticities and position angles. For compu- 
tational efficiency, Galphat pre-generates two-dimensional cumulative distributions of 
Sersic profile with many different n using a rigorous error tolerance. By assigning pixel 
values by table interpolation, Galphat generates a model image with arbitrary axis ra- 
tio b/a. Image rotation is done in Fourier space using three shear operations; a rotation 
matrix is decomposed into three sequential shear matrices in the X,Y, and X directions 
and then each shear operation is carried out using the Fourier shift theorem ([31) . Then, 
the model image is convolved with a given PSF and an adjustable flux pedestal with 
a spatial gradient is added to model the sky background. Galphat uses a Poisson 
likelihood function to describe the photon counting process. A more detail description 
of the algorithm will be published in upcoming method papers (@) and (@). 

3. GALPHAT performance 

For testing, we generate an ensemble of synthetic galaxy images over a wide range 
of signal-to-noise ratios S/N and sizes using the IDL program by Haufiler|l). Gal- 
phat successfully recovers the input parameters with robust statistical confidence in- 
tervals (CL). 

FIGURE Q] compares the Galphat posterior median and 99.7% CL with the true 
magnitude, effective radius r e , Sersic index n and axis ratio b/a as a function of S/N. 
We define S/N as the ratio of the average flux per pixel from the source within r e 
to the average flux from noise [following Haufiler et al.|0)]. We use 10,000 converged 
MCMC samples for characterizing the posterior. As S/N decreases, the 99.7% CL range 
increases. However Galphat is robust and encloses true parameter within the 99.7% 
CL even in the extreme case of S/N~ 1. 

To assess the performance for bulge-disk decomposition, we use simulated galaxies 
with a Sersic bulge (n = 4) and an exponential disk (n = 1). Figure [2] compares 
the Galphat parameter inferences to the exact values with varying bulge-to-total flux 
ratio B/T. As the bulge fraction increases, the inference of bulge parameters becomes 
more reliable and the inference of disk parameters becomes less reliable, as expected. 
The marginalized posterior distribution for pairs of parameters show that correlations 
between parameters abound; the magnitude of the correlations depend on B/T. Clearly, 
a characterization of this parameter correlation is necessary for interpreting the distri- 
bution of these parameters for the galaxy population overall and for testing hypotheses 
of galaxy formation and evolution. Galphat provides the full posterior probability 
distribution of model parameter and enables us to quote reliable uncertainties for each 



1 Galphat web page: http://sites.google.com/site/galphat/galphat 
2 For more information on BIE, see the BIE web page l[8|) and (Q) 
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Figure 1. The differences between Galphat posterior medians and the exact values 
as a function of S/N for Sersic n — 4 profiles with b/a = 1. Error bars are 99.7% CL. 
The Galphat inference is unbiased until S/N~ 3. The systematically lower estimate 
of b/a owes to the uniform prior with a range of [0,1]. 
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Figure 2. The differences between Galphat posterior medians and the exact values 
for bulge-disk galaxies as a function of fb —B/T. Error bars are 99.7% CL. From the 
top left to the bottom right panel, corresponding parameters are bulge magnitude, 
bulge r e , bulge n, bulge b/a, disk magnitude, disk r e , disk b/a, and sky background. 
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model parameter by including all the systematic correlations. Furthermore, the pos- 
terior distribution enables the computation of higher-order statistics and more general 
model comparisons. 

Convergence may be slow in high-dimensional parameter spaces owing to the com- 
plexity of the likelihood function. We have found it productive to iteratively add image 
information using a hierarchy of successively aggregated images. Beginning with the 
most aggregated image (Level 0) one computes the posterior, P{9\D ). The posterior 
for the next level (Level 1) is P(0|Di) oc P{e\D )[P{D 1 \9)/ P{D \6)} and so on. Us- 
ing this hierarchical data structure, Galphat reduces the run time by factors of two, 
depending on the level of aggregation, by accelerating convergence. Also, the BIE check- 
points the full state of the simulation, efficiently enabling the posterior distribution to 
be augmented at a later date. 

Galphat can be run on either a single CPU or in a cluster environment. Table [T] 
provides Galphat runtimes for a simple one-component decomposition. Of course, 
in addition to computing hardware, the time for your parameter estimate will depend 
on image size, characteristics of the model, the MCMC algorithm, the convergence 
diagnostic method, and the required number of posterior samples. More detailed results 
will be shown in the methods paper ([H). 



Table 1. wall clock time for Galphat 



Image 


samples 


CPU 


Processors Level 


runtime 


200 by 200 
single Sersic 


20,000 


AMD Athlon(tm) MP 1800+ 
1533 MHz 


8 1 


2 hr 


200 by 200 
single Sersic 


20,000 


AMD Athlon(tm) MP 1800+ 
1533 MHz 


8 2 


40 min 



4. Ongoing science program using GALPHAT 

We are currently performing a bulge+disk model decomposition of 2MASS _KT s band 
selected galaxies with 7 < K s < 11 to derive luminosity functions for each component 
in the present day Universe. We will investigate the effect of environment on these 
distributions. From these, we can compare to mass assembly histories predicted from 
hierarchical galaxy formation theories. 
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